Re-ranking Summaries Based on Cross-Document Information Extraction

Authors

  • Heng Ji
  • Juan Liu
  • Benoît Favre
  • Daniel Gillick
  • Dilek Z. Hakkani-Tür
Abstract

This paper describes a novel approach to improving multi-document summarization based on cross-document information extraction (IE). We describe a method to automatically incorporate IE results into sentence ranking. Experiments show that our integration methods significantly improve a high-performing multi-document summarization system according to the ROUGE-2 and ROUGE-SU4 metrics (7.38% relative improvement on ROUGE-2 recall), and that the generated summaries are preferred by human subjects (0.78 higher TAC Content score and 0.11 higher Readability/Fluency score).


Similar resources

Multi-document Summarization via Information Extraction: A Revisit

This paper describes a novel approach of improving multi-document summarization based on cross-document information extraction (IE). We first show that IE itself is not sufficient to produce fluent and coherent summaries. Then we attempt various methods to automatically incorporate IE results into sentence ranking. Experiments have shown our integration methods can significantly improve a high-...


AMDS: Sentence Extraction Based Proficient Framework For Multi-Document Summarization

The rapid growth of electronic documents on the World Wide Web has overloaded users trying to access information. Therefore, abstracting the primary content from numerous documents related to the same topic is essential. Summarization of multiple documents helps in making valuable decisions in less time. This paper proposed a framework named Adept Multi-Document Summarization (AMDS) fo...


Using Bilingual Information for Cross-Language Document Summarization

Cross-language document summarization is defined as the task of producing a summary in a target language (e.g. Chinese) for a set of documents in a source language (e.g. English). Existing methods for addressing this task make use of either the information from the original documents in the source language or the information from the translated documents in the target language. In this study, w...


A semantic partition based text mining model for document classification

Feature Extraction is a mechanism used to extract key phrases from any given text documents. This extraction can be weighted, ranked or semantic based. Weighted and Ranking based feature extraction normally assigns scores to extracted words based on various heuristics. Highest scoring words are seen as important. Semantic based extractions normally try to understand word meanings, and words wit...


THUIR at TREC2008: Relevance Feedback Track

Tsinghua University Information Retrieval Group (THUIR) participated in the first Relevance Feedback Track of TREC2008. The TMiner search engine was used as our text retrieval system, because its processing capability and flexibility on large text data have been demonstrated over many years of Web Track and Terabyte Track participation. In the track, we studied two approaches: 1) query ...




Publication date: 2010